Data
As the foundation of every graphic, ggplot2 uses data to construct a plot. The system works best if the data is provided in a tidy format, which briefly means a rectangular data frame structure where rows are observations and columns are variables.
As the first step in many plots, you would pass the data to the ggplot() function, which stores the data to be used later by other parts of the plotting system. For example, if we intend to make a graphic about the mpg dataset, we would start as follows:
ggplot(data = mpg)
The mapping of a plot is a set of instructions on how parts of the data are mapped onto aesthetic attributes of geometric objects.
A mapping can be made by using the aes() function to make pairs of graphical attributes and parts of the data.
If we want the cty and hwy columns to map to the x- and y-coordinates in the plot, we can do that as follows:
ggplot(mpg, mapping = aes(x = cty, y = hwy))
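Layers convert mapped data into human-readable visuals. Each layer has three components:
- Geometry (geom_*()): defines the visual shapes that are drawn (points, lines, rectangles).
- Statistical transformation (stat_*()): computes or transforms the data before it is drawn.
- Position adjustment: determines where each element is placed, for example when elements overlap.
Example: two layers displaying cty and hwy from the mpg dataset: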
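A minimal sketch of such a two-layer plot. The linear fit (method = "lm") is an assumption chosen for illustration; the text does not say which smoothing method was used.

```r
library(ggplot2)

# Two layers on one plot: the raw points plus a fitted linear trend.
p_layers <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

p_layers
```

Both layers inherit the x and y mapping set in ggplot(), so neither geom needs its own aes() call.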
- points (geom_point)
- a smoothed trend line (geom_smooth)
Scales translate between data values and visual aesthetics, and their guides (axes and legends) let readers translate what they see back into data values.
Use scale_{aesthetic}_{type}() functions to define scales.
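As a minimal sketch of the scale_{aesthetic}_{type}() naming pattern, the two plots below set a colour scale and an x-axis scale. The choice of a scatter plot of cty against hwy is an assumption made for illustration.

```r
library(ggplot2)

# Colour aesthetic, discrete viridis type: scale_colour_viridis_d()
p_colour <- ggplot(mpg, aes(x = cty, y = hwy, colour = class)) +
  geom_point() +
  scale_colour_viridis_d()

# x aesthetic, log10 type: scale_x_log10()
p_log <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  scale_x_log10()

p_colour
p_log
```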
Typical uses include mapping the class column to a viridis color palette and putting the cty column on a log scale.
Facets split data into smaller panels (“small multiples”) based on one or more variables. This helps quickly reveal patterns or trends within subsets. Define facets using a formula with functions like facet_grid() or facet_wrap().
Example: using the iris dataset to create a scatter plot for each species:
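One way this could look, assuming Sepal.Length and Sepal.Width are the two variables plotted (the text names the dataset and the faceting variable but not the axes):

```r
library(ggplot2)

# One scatter panel per species; the formula ~ Species names the faceting variable.
p_facets <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  facet_wrap(~ Species)

p_facets
```

facet_wrap() lays the panels out in a ribbon, while facet_grid(Species ~ .) would place one species per row.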
Selected columns of the mpg dataset:
- year: year of manufacture
- drv: the type of drive train, where f = front-wheel drive, r = rear-wheel drive, 4 = 4wd
- cty: city miles per gallon
- hwy: highway miles per gallon
Themes control non-data elements, defining the look and feel of a plot (e.g., legend position, background, axis styles).
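As a rough sketch, a complete theme can be applied first and then adjusted element by element. The particular theme (theme_minimal()) and the tweaks shown are assumptions picked for illustration.

```r
library(ggplot2)

p_theme <- ggplot(mpg, aes(x = cty, y = hwy, colour = class)) +
  geom_point() +
  theme_minimal() +                            # complete theme as a starting point
  theme(
    legend.position = "top",                   # move the legend above the panel
    axis.title = element_text(face = "bold")   # restyle a single text element
  )

p_theme
```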
- Complete themes (theme_*()) for quick styling.
- theme() and element_*() for detailed adjustments.
Commonly used geoms:
- geom_bar()
- geom_line()
- geom_boxplot()
- geom_histogram()
- geom_density()
In this example, we create a bar plot of the class column from the mpg dataset. geom_bar() counts the number of observations in each class and plots the result.
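A minimal sketch of that bar plot. Only x is mapped, because geom_bar() computes the bar heights itself:

```r
library(ggplot2)

# geom_bar() counts the rows in each class and uses the count as the bar height.
p_bar <- ggplot(mpg, aes(x = class)) +
  geom_bar()

p_bar
```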
What if we already have the counts and want to plot them?
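One option is geom_col(), which takes bar heights directly from the y aesthetic. In the sketch below the counts table is built with table() purely for illustration; class_counts and its column n are hypothetical names.

```r
library(ggplot2)

# Pre-computed counts: one row per class, with the count in column n.
class_counts <- as.data.frame(table(class = mpg$class), responseName = "n")

# geom_col() plots the values as given, with no counting step.
p_col <- ggplot(class_counts, aes(x = class, y = n)) +
  geom_col()

p_col
```

geom_bar(stat = "identity") gives the same result and is still common in older code.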
Using the Orange dataset:
| Tree | age | circumference |
|---|---|---|
| 1 | 118 | 30 |
| 1 | 484 | 58 |
| 1 | 664 | 87 |
| 1 | 1004 | 115 |
| 1 | 1231 | 120 |
| 1 | 1372 | 142 |
| 1 | 1582 | 145 |
| 2 | 118 | 33 |
| 2 | 484 | 69 |
| 2 | 664 | 111 |
| 2 | 1004 | 156 |
| 2 | 1231 | 172 |
| 2 | 1372 | 203 |
| 2 | 1582 | 203 |
| 3 | 118 | 30 |
| 3 | 484 | 51 |
| 3 | 664 | 75 |
| 3 | 1004 | 108 |
| 3 | 1231 | 115 |
| 3 | 1372 | 139 |
| 3 | 1582 | 140 |
| 4 | 118 | 32 |
| 4 | 484 | 62 |
| 4 | 664 | 112 |
| 4 | 1004 | 167 |
| 4 | 1231 | 179 |
| 4 | 1372 | 209 |
| 4 | 1582 | 214 |
| 5 | 118 | 30 |
| 5 | 484 | 49 |
| 5 | 664 | 81 |
| 5 | 1004 | 125 |
| 5 | 1231 | 142 |
| 5 | 1372 | 174 |
| 5 | 1582 | 177 |
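The table is already in the long format that geom_line() works with: one row per tree and measurement. A minimal sketch, assuming the aim is one growth curve per tree:

```r
library(ggplot2)

# Mapping Tree to colour also groups the rows, so each tree gets its own line.
p_line <- ggplot(Orange, aes(x = age, y = circumference, colour = Tree)) +
  geom_line() +
  geom_point()

p_line
```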
What if we plot each species separately with different colors?
What if we want the histograms stacked on top of each other?
What if we want each histogram in a different panel?
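The three variants raised by these questions can be sketched from one shared base plot. The iris column Sepal.Length and the bin width of 0.25 are assumptions made for illustration.

```r
library(ggplot2)

# Shared data and mapping: one fill colour per species.
base <- ggplot(iris, aes(x = Sepal.Length, fill = Species))

# Separate, semi-transparent histograms overlaid on the same axes.
p_overlay <- base +
  geom_histogram(binwidth = 0.25, position = "identity", alpha = 0.5)

# Species stacked within each bin (the default position for geom_histogram()).
p_stacked <- base +
  geom_histogram(binwidth = 0.25, position = "stack")

# One panel per species.
p_panels <- base +
  geom_histogram(binwidth = 0.25) +
  facet_wrap(~ Species)

p_overlay
p_stacked
p_panels
```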
library(ggplot2)
library(gridExtra)
# Density of sepal length, one row of panels per species
p1 <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_density(alpha = 0.5) +
  facet_grid(Species ~ .)
# Growth curve for each orange tree
p2 <- ggplot(Orange, aes(x = age, y = circumference, color = Tree)) +
  geom_line()
# Fuel economy by number of cylinders
p3 <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()
# Arrange the three plots in a two-column grid
grid.arrange(p1, p2, p3, ncol = 2)
The ggsave() function can be used to save a plot to a file. It takes the following arguments:
- filename: name of the file to save the plot to
- plot: the plot to save (defaults to the last plot displayed)
- width and height: dimensions of the plot
- units: units of the dimensions (default is inches; possible values are “in”, “cm”, “mm”, “px”)
- device: the device to use for saving the plot (by default inferred from the filename extension; possible values include “png”, “pdf”, “jpeg”, “tiff”, “bmp”)
- dpi: resolution of the plot in dots per inch
R Graph Gallery: https://www.r-graph-gallery.com/
R Gallery Book: https://bookdown.org/content/b298e479-b1ab-49fa-b83d-a57c2b034d49/
R Data Visualization: https://r4ds.had.co.nz/data-visualisation.html